Effects of Random Sampling on SVM Hyper-parameter Tuning

Authors

  • Tomás Horváth
  • Rafael Gomes Mantovani
  • André Carlos Ponce de Leon Ferreira de Carvalho
Abstract

Hyper-parameter tuning is one of the crucial steps in the successful application of machine learning algorithms to real data. In general, the tuning process is modeled as an optimization problem for which several methods have been proposed. For complex algorithms, evaluating a hyper-parameter configuration is expensive, so the tuning runtime is often reduced through data sampling. In this paper, the effect of sample size on the results of the hyper-parameter tuning process is investigated. Hyper-parameters of Support Vector Machines are tuned on samples of different sizes generated from a dataset. The Hausdorff distance is proposed for computing the difference between the results of hyper-parameter tuning on two samples of different sizes. The experiments use 100 real-world datasets and two tuning methods (Random Search and Particle Swarm Optimization). They reveal relations between sample size and tuning results that open promising directions for future investigation.
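As a rough illustration of the distance mentioned in the abstract, the sketch below computes the Hausdorff distance between two finite sets of hyper-parameter configurations. The abstract does not say how the point sets are built, so everything specific here is an assumption: the function names, the choice of (log2 C, log2 gamma) coordinates, the Euclidean ground metric, and the example values are all hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical tuning results: each point is (log2 C, log2 gamma).
# These values are made up for illustration, not taken from the paper.
small_sample_configs = [(0.0, -3.0), (2.0, -5.0)]
large_sample_configs = [(0.0, -3.0), (2.0, -1.0)]

distance = hausdorff(small_sample_configs, large_sample_configs)
print(f"Hausdorff distance: {distance:.3f}")
```

A distance of zero means both samples led to the same set of configurations. Larger values mean at least one configuration found on one sample has no close counterpart among those found on the other.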

Similar resources

Parameter Tuning via Kernel Matrix Approximation for Support Vector Machine

Parameter tuning is essential to generalization of support vector machine (SVM). Previous methods usually adopt a nested two-layer framework, where the inner layer solves a convex optimization problem, and the outer layer selects the hyper-parameters by minimizing either cross validation or other error bounds. In this paper, we propose a novel parameter tuning approach for SVM via kernel matrix...

Investigating Exploratory Capabilities of Uncertainty Sampling using SVMs in Active Learning

Active learning provides a solution for annotating huge pools of data efficiently to use it for mining and business analytics. Therefore, it reduces the number of instances that have to be annotated by an expert to the most informative ones. A common approach is to use uncertainty sampling in combination with a support vector machine (SVM). Some papers argue that uncertainty sampling performs b...

Practical selection of SVM parameters and noise estimation for SVM regression

We investigate practical selection of hyper-parameters for support vector machines (SVM) regression (that is, epsilon-insensitive zone and regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than re-sampling approaches commonly used in SVM applications. In particular, we describe a new analytical prescription for s...

On the overestimation of random forest’s out-of-bag error

Background: The ensemble method random forests has become a popular classification tool in bioinformatics and related fields. The out-of-bag error is an error estimation technique which is often used to evaluate the accuracy of a random forest as well as for selecting appropriate values for tuning parameters, such as the number of candidate predictors that are randomly drawn for a split, referre...

Image Classification Based on KPCA and SVM with Randomized Hyper-parameter Optimization

Image classification is one of the most fundamental and useful activities in the computer vision domain. For better accuracy and execution efficiency with high-dimensional feature descriptors in image classification, we propose a novel framework for multi-class image classification based on kernel principal component analysis (KPCA) for feature descriptors post-processing and s...

Publication date: 2016